
    The Exploration-Exploitation Trade-Off in Sequential Decision Making Problems

    Sequential decision making problems require an agent to repeatedly choose from a set of available actions. Common to such problems is the exploration-exploitation trade-off, where an agent must choose between taking the action expected to yield the best reward (exploitation) and trying an alternative action for potential future benefit (exploration). The main focus of this thesis is to understand in more detail the role this trade-off plays in various important sequential decision making problems, in terms of maximising finite-time reward. The most common and best studied abstraction of the exploration-exploitation trade-off is the classic multi-armed bandit problem. In this thesis we study several important extensions that are more suitable than the classic problem to real-world applications. These extensions include scenarios where the rewards for actions change over time or where the presence of other agents must be repeatedly considered. In these contexts, the exploration-exploitation trade-off plays a more complicated role in maximising finite-time performance. For example, the amount of exploration required changes constantly in a dynamic decision problem; in multiagent problems, agents can explore through communication; and in repeated games, the exploration-exploitation trade-off must be considered jointly with game-theoretic reasoning. Existing techniques for balancing exploration and exploitation focus on achieving desirable asymptotic behaviour and are in general only applicable to basic decision problems. The most flexible state-of-the-art approaches, ε-greedy and ε-first, require exploration parameters to be set a priori, the optimal values of which are highly dependent on the problem faced. To overcome this, we construct a novel algorithm, ε-ADAPT, which has no exploration parameters and can adapt exploration on-line for a wide range of problems. ε-ADAPT is built on newly proven theoretical properties of the ε-first policy, and we demonstrate that it can accurately learn not only how much to explore, but also when and which actions to explore.
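The ε-greedy baseline discussed above can be sketched in a few lines. The following is a minimal illustration for a Bernoulli multi-armed bandit with a fixed exploration rate ε (ε-ADAPT itself removes this parameter and is not reproduced here); the function name and reward model are illustrative assumptions, not taken from the thesis:

```python
import random

def eps_greedy_bandit(true_means, epsilon=0.1, horizon=1000, seed=0):
    """Run a fixed-rate epsilon-greedy policy on a Bernoulli bandit.

    With probability epsilon pull a uniformly random arm (explore);
    otherwise pull the arm with the highest empirical mean (exploit).
    """
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k        # number of pulls per arm
    estimates = [0.0] * k   # empirical mean reward per arm
    total_reward = 0.0
    for _ in range(horizon):
        if rng.random() < epsilon:
            arm = rng.randrange(k)                           # explore
        else:
            arm = max(range(k), key=lambda a: estimates[a])  # exploit
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        # incremental update of the running mean for the pulled arm
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward
    return total_reward, estimates
```

The fixed ε here is exactly the a priori parameter the thesis argues against: too small and the best arm may never be found, too large and reward is wasted on exploration.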

    Frequency-Domain Stochastic Modeling of Stationary Bivariate or Complex-Valued Signals

    There are three equivalent ways of representing two jointly observed real-valued signals: as a bivariate vector signal, as a single complex-valued signal, or as two analytic signals known as the rotary components. Each representation has unique advantages depending on the system of interest and the application goals. In this paper we provide a joint framework for all three representations in the context of frequency-domain stochastic modeling. This framework allows us to extend many established statistical procedures for bivariate vector time series to complex-valued and rotary representations. These include procedures for parametrically modeling signal coherence, estimating model parameters using the Whittle likelihood, performing semi-parametric modeling, and selecting between classes of nested models. We also provide a new method of testing for impropriety in complex-valued signals, which tests for noncircular or anisotropic second-order statistical structure when the signal is represented in the complex plane. Finally, we demonstrate the usefulness of our methodology in capturing the anisotropic structure of signals observed from fluid dynamic simulations of turbulence. (To appear in IEEE Transactions on Signal Processing.)
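The rotary representation referred to above can be illustrated directly: the complex-valued signal z = u + iv is split in the frequency domain into a counter-clockwise (positive-frequency) analytic part and a clockwise (negative-frequency) part. This is a minimal sketch of the representation only, not the paper's modeling framework:

```python
import numpy as np

def rotary_components(u, v):
    """Split a bivariate signal (u, v) into its rotary components.

    The complex signal z = u + 1j*v is decomposed in the frequency
    domain: positive frequencies give the counter-clockwise component,
    negative frequencies the clockwise component.
    """
    z = u + 1j * v
    Z = np.fft.fft(z)
    freqs = np.fft.fftfreq(len(z))
    z_ccw = np.fft.ifft(np.where(freqs >= 0, Z, 0))  # counter-clockwise
    z_cw = np.fft.ifft(np.where(freqs < 0, Z, 0))    # clockwise
    return z_ccw, z_cw  # z_ccw + z_cw reconstructs z exactly
```

Because the two components partition the Fourier coefficients, the three representations (bivariate, complex, rotary) carry exactly the same information.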

    A Power Variance Test for Nonstationarity in Complex-Valued Signals

    We propose a novel algorithm for testing the hypothesis of nonstationarity in complex-valued signals. The implementation uses both the bootstrap and the Fast Fourier Transform, so the algorithm runs in O(N log N) time, where N is the length of the observed signal. The test procedure examines the second-order structure and contrasts the observed power variance - i.e. the variability of the instantaneous variance over time - with the expected characteristics of stationary signals generated via the bootstrap method. Our algorithmic procedure is capable of detecting different types of nonstationarity, such as jumps or strong sinusoidal components. We illustrate the utility of our test and algorithm through application to turbulent flow data from fluid dynamics.
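A minimal sketch of the kind of bootstrap comparison described above, assuming Fourier phase randomisation as the mechanism for generating stationary surrogates with the same power spectrum (the paper's exact bootstrap scheme may differ); window size and replicate count are illustrative:

```python
import numpy as np

def power_variance_stat(z, win=32):
    """Variance over time of the windowed instantaneous power |z_t|^2."""
    power = np.abs(np.asarray(z)) ** 2
    n_win = len(power) // win
    window_means = power[: n_win * win].reshape(n_win, win).mean(axis=1)
    return window_means.var()

def stationarity_pvalue(z, n_boot=200, win=32, seed=0):
    """Bootstrap p-value contrasting observed power variance with
    stationary surrogates.

    Surrogates are built by randomising the Fourier phases of z, which
    preserves its power spectrum; each replicate costs one FFT pair,
    i.e. O(N log N).
    """
    rng = np.random.default_rng(seed)
    amplitudes = np.abs(np.fft.fft(z))
    observed = power_variance_stat(z, win)
    exceed = 0
    for _ in range(n_boot):
        phases = np.exp(2j * np.pi * rng.random(len(z)))
        surrogate = np.fft.ifft(amplitudes * phases)
        if power_variance_stat(surrogate, win) >= observed:
            exceed += 1
    return (exceed + 1) / (n_boot + 1)
```

A variance jump, for example, concentrates power in part of the record and inflates the observed statistic far beyond the surrogate distribution, yielding a small p-value.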

    Separating Mesoscale and Submesoscale Flows from Clustered Drifter Trajectories

    Drifters deployed in close proximity collectively provide a unique observational data set with which to separate mesoscale and submesoscale flows. In this paper we provide a principled approach for doing so by fitting observed velocities to a local Taylor expansion of the velocity flow field. We demonstrate how to estimate mesoscale and submesoscale quantities that evolve slowly over time, as well as their associated statistical uncertainty. We show that in practice the mesoscale component of our model can explain much of the first- and second-moment variability in drifter velocities, especially at low frequencies. This results in much lower and more meaningful measures of submesoscale diffusivity, which would otherwise be contaminated by unresolved mesoscale flow. We quantify these effects theoretically by computing Lagrangian frequency spectra, and demonstrate the usefulness of our methodology through simulations as well as with real observations from the LatMix deployment of drifters. The outcome of this method is a full Lagrangian decomposition of each drifter trajectory into three components that represent the background, mesoscale, and submesoscale flow.
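At its core, fitting velocities to a local Taylor expansion amounts to an ordinary least-squares regression of observed velocities on position offsets; the fitted part plays the role of the (meso)scale-resolved flow and the residual the unresolved smaller-scale motion. This is a minimal static sketch - the paper additionally handles slow time evolution and uncertainty quantification, which are omitted here:

```python
import numpy as np

def fit_local_flow(x, y, u, v):
    """Least-squares fit of velocities to a first-order Taylor expansion.

    Models u ~ u0 + du/dx * (x - xbar) + du/dy * (y - ybar), and
    similarly for v. Returns the fitted coefficients and the residual
    velocities attributed to smaller-scale motion.
    """
    dx, dy = x - x.mean(), y - y.mean()
    A = np.column_stack([np.ones_like(dx), dx, dy])  # design matrix
    coef_u, *_ = np.linalg.lstsq(A, u, rcond=None)
    coef_v, *_ = np.linalg.lstsq(A, v, rcond=None)
    u_fit, v_fit = A @ coef_u, A @ coef_v
    return (coef_u, coef_v), (u - u_fit, v - v_fit)
```

The gradient coefficients recover familiar flow quantities: for instance, vorticity is dv/dx - du/dy and divergence is du/dx + dv/dy.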

    Detecting outlying demand in multi-leg bookings for transportation networks

    Network effects complicate demand forecasting in general, and outlier detection in particular. For example, in transportation networks, sudden increases in demand for a specific destination will not only affect the legs arriving at that destination, but also connected legs nearby in the network. Network effects are particularly relevant when transport service providers, such as railway or coach companies, offer many multi-leg itineraries. In this paper, we present a novel method for generating automated outlier alerts, to support analysts in adjusting demand forecasts accordingly for reliable planning. To create such alerts, we propose a two-step method for detecting outlying demand from transportation network bookings. The first step clusters network legs to appropriately partition and pool booking patterns. The second step identifies outliers within each cluster to create a ranked alert list of affected legs. We show that this method outperforms analyses that independently consider each leg in a network, especially in highly-connected networks where most passengers book multi-leg itineraries. We illustrate the method's applicability on empirical data obtained from Deutsche Bahn and with a detailed simulation study. The latter demonstrates the robustness of the approach and quantifies the potential revenue benefits of adjusting for outlying demand in networks.
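The second (outlier-scoring) step can be illustrated with a simple leave-one-out z-score within each cluster. This is an illustrative stand-in for the paper's actual detection rule, and assumes cluster labels from step one are already available:

```python
import numpy as np

def rank_outlying_legs(bookings, labels, threshold=3.0):
    """Score each leg by its worst leave-one-out z-score in its cluster.

    bookings: (n_legs, n_days) array of bookings per leg and day.
    labels:   cluster label per leg, from a prior clustering step
              (each cluster is assumed to contain at least two legs).
    Returns (leg_index, score) pairs exceeding `threshold`, ranked
    from most to least severe - a ranked alert list for analysts.
    """
    scores = np.zeros(len(bookings))
    for c in np.unique(labels):
        idx = [int(i) for i in np.where(labels == c)[0]]
        for i in idx:
            # compare leg i against the other legs in its cluster
            others = bookings[[j for j in idx if j != i]]
            mu = others.mean(axis=0)
            sd = others.std(axis=0) + 1e-9  # guard against zero spread
            scores[i] = np.abs((bookings[i] - mu) / sd).max()
    ranked = sorted(range(len(scores)), key=lambda i: -scores[i])
    return [(i, float(scores[i])) for i in ranked if scores[i] > threshold]
```

Pooling legs within a cluster is what lets a demand spike on one leg stand out against the shared booking pattern of its connected neighbours.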

    A multivariate pseudo-likelihood approach to estimating directional ocean wave models

    Ocean buoy data in the form of high frequency multivariate time series are routinely recorded at many locations in the world's oceans. Such data can be used to characterise the ocean wavefield, which is important for numerous socio-economic and scientific reasons. This characterisation is typically achieved by modelling the frequency-direction spectrum, which decomposes spatiotemporal variability by both frequency and direction. State-of-the-art methods for estimating the parameters of such models do not make use of the full spatiotemporal content of the buoy observations, due to unnecessary assumptions and smoothing steps. We explain how the multivariate debiased Whittle likelihood can be used to jointly estimate all parameters of such frequency-direction spectra directly from the recorded time series. When applied to North Sea buoy data, debiased Whittle likelihood inference reveals smooth evolution of spectral parameters over time. We discuss challenging practical issues, including model misspecification, and provide guidelines for future application of the method.

    The debiased Whittle likelihood

    The Whittle likelihood is a widely used and computationally efficient pseudolikelihood. However, it is known to produce biased parameter estimates with finite sample sizes for large classes of models. We propose a method for debiasing Whittle estimates for second-order stationary stochastic processes. The debiased Whittle likelihood can be computed in the same O(n log n) operations as the standard Whittle approach. We demonstrate the superior performance of our method in simulation studies and in application to a large-scale oceanographic dataset, where in both cases the debiased approach reduces bias by up to two orders of magnitude, achieving estimates that are close to those of the exact maximum likelihood, at a fraction of the computational cost. We prove that the method yields estimates that are consistent at an optimal convergence rate of n^(-1/2) for Gaussian processes and for certain classes of non-Gaussian or nonlinear processes. This is established under weaker assumptions than in the standard theory, and in particular the power spectral density is not required to be continuous in frequency. We describe how the method can be readily combined with standard methods of bias reduction, such as tapering and differencing, to further reduce bias in parameter estimates.
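The core idea - replacing the true spectral density in the Whittle likelihood with the expected periodogram, computable in O(n log n) from the model autocovariance - can be sketched as follows. This is a simplified univariate illustration with hypothetical function names, not the authors' implementation:

```python
import numpy as np

def expected_periodogram(acv):
    """E[I(f)] at the Fourier frequencies, from a model autocovariance.

    For a real-valued process observed at n points, the expected
    periodogram is the Fourier transform of the triangle-kernel-weighted
    autocovariance (1 - tau/n) * c(tau): an O(n log n) computation.
    """
    n = len(acv)
    weighted = (1 - np.arange(n) / n) * acv
    F = np.fft.fft(weighted)
    # fold in negative lags, using c(-tau) = c(tau) for a real process
    return 2.0 * np.real(F) - weighted[0]

def debiased_whittle(z, acv_fn, params):
    """Negative debiased Whittle log-likelihood (up to constants).

    Identical to the standard Whittle objective, except the model
    spectral density is replaced by the expected periodogram, which
    accounts for finite-sample bias (spectral blurring).
    """
    n = len(z)
    I = np.abs(np.fft.fft(z)) ** 2 / n            # periodogram
    Ebar = expected_periodogram(acv_fn(params, n))
    return float(np.sum(np.log(Ebar) + I / Ebar))
```

For white noise the triangle weighting has no effect (all lagged autocovariances are zero), so the debiased and standard Whittle objectives coincide; the difference appears for correlated processes, where the periodogram is a biased estimate of the spectrum at finite n.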